Extraction of objects and page segmentation of composite documents with non-uniform background

Authors

  • Yasser Alginahi
  • Maher A. Sid-Ahmed
  • Majid Ahmadi
Abstract

In designing page segmentation systems for documents with complex backgrounds and poor illumination, separating the background from the objects (text and images) is crucial to the success of the system. The local neural binarization technique developed by the authors is used to extract the objects from document images with complex backgrounds. This algorithm uses statistical and textural feature measures to obtain a feature vector for each pixel from a window of size (2n+1) × (2n+1), where n ≥ 1. These features provide a local understanding of each pixel from its neighbourhood, making it easier to classify each pixel into its proper class. A Multi-Layer Perceptron Neural Network (MLP NN) is then used to classify each pixel in the image. The results of thresholding are then passed to a block segmentation stage. The block segmentation technique developed is a feature-based method that uses a Neural Network classifier to automatically segment and classify the image contents into text and halftone images. The results of page segmentation are then ready to be passed to an OCR system that converts the text image into a format that can be stored and modified.
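The per-pixel windowing step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not list the actual statistical and textural measures, so the four features used here (mean, standard deviation, minimum, maximum), the function name `window_features`, and the edge-replication border handling are all hypothetical choices, not the authors' method. The MLP classification stage is not shown.

```python
from statistics import mean, pstdev


def window_features(image, n=1):
    """Compute a small feature vector for every pixel of a grayscale image.

    For each pixel, the (2n+1) x (2n+1) window centred on it is collected and
    summarised as [mean, std, min, max]. These four measures are assumptions
    chosen for illustration; the abstract does not name the real features.
    Pixels outside the image are filled by replicating the nearest edge pixel
    (also an assumption).
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    rows, cols = len(image), len(image[0])
    features = []
    for r in range(rows):
        row_feats = []
        for c in range(cols):
            # Gather the window, clamping indices to stay inside the image.
            window = [
                image[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
                for dr in range(-n, n + 1)
                for dc in range(-n, n + 1)
            ]
            row_feats.append(
                [mean(window), pstdev(window), min(window), max(window)]
            )
        features.append(row_feats)
    return features


# Tiny example: one dark pixel (high value) on a uniform background.
image = [
    [10, 10, 10],
    [10, 200, 10],
    [10, 10, 10],
]
feats = window_features(image, n=1)
center = feats[1][1]  # feature vector of the middle pixel
```

In a pipeline like the one the abstract describes, each such feature vector would be the input to a trained classifier that labels the pixel as object or background.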


Similar resources

Persian Printed Document Analysis and Page Segmentation

This paper presents a hybrid method, combining low-resolution and high-resolution analysis, for Persian page segmentation. In the low-resolution page segmentation, a pyramidal image structure is constructed for multiscale analysis and segments the document image into a set of regions. In the high-resolution page segmentation, connected components analysis segments each region into homogeneous regions and identifyi...


Document Analysis And Classification Based On Passing Window

In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction and document classification. A document image is enhanced in the preprocessing stage by removing noise, binarization, and detecting and correcting image skew. In document segmentation, an algorith...


Segmentation Improvement of High Resolution Remote Sensing Images based on superpixels using Edge-based SLIC algorithm (E-SLIC)

The segmentation of high resolution remote sensing images is one of the most important analyses and plays a significant role in the maximal and exact extraction of information. There are different types of segmentation methods, among which the use of superpixels is one of the most important. Several methods have been proposed for extracting superpixels. Among the most successful ones, we can r...


Extracting Vessel Centerlines From Retinal Images Using Topographical Properties and Directional Filters

In this paper we consider the problem of blood vessel segmentation in retinal images. After enhancing the retinal image, we use the green channel of the images for segmentation, as it provides better discrimination between vessels and background. We consider the negative of the retinal green channel image as a topographical surface and extract ridge points on this surface. The points with this property are l...


Object-Oriented Method for Automatic Extraction of Road from High Resolution Satellite Images

As the information carried in a high spatial resolution image is not represented by single pixels but by meaningful image objects, which include the association of multiple pixels and their mutual relations, the object based method has become one of the most commonly used strategies for the processing of high resolution imagery. This processing comprises two fundamental and critical steps towar...




Publication date: 2005